Exploiting Translational Correspondences for Pattern-Independent MWE Identification

نویسندگان

  • Sina Zarrieß
  • Jonas Kuhn
چکیده

Based on a study of verb translations in the Europarl corpus, we argue that a wide range of MWE patterns can be identified in translations that exhibit a correspondence between a single lexical item in the source language and a group of lexical items in the target language. We show that these correspondences can be reliably detected on dependency-parsed, word-aligned sentences. We propose an extraction method that combines word alignment with syntactic filters and is independent of the structural pattern of the translation.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Using the Adaptive Frequency Nonlinear Oscillator for Earning an Energy Efficient Motion Pattern in a Leg- Like Stretchable Pendulum by Exploiting the Resonant Mode

In this paper we investigate a biological framework to generate and adapt a motion pattern so that can be energy efficient. In fact, the motion pattern in legged animals and human emerges among interaction between a central pattern generator neural network called CPG and the musculoskeletal system. Here, we model this neuro - musculoskeletal system by means of a leg - like mechanical system cal...

متن کامل

mwetoolkit: a Framework for Multiword Expression Identification

This paper presents the Multiword Expression Toolkit (mwetoolkit), an environment for type and language-independent MWE identification from corpora. The mwetoolkit provides a targeted list of MWE candidates, extracted and filtered according to a number of user-defined criteria and a set of standard statistical association measures. For generating corpus counts, the toolkit provides both a corpu...

متن کامل

A Repository of Variation Patterns for Multiword Expressions

One of the crucial issues in the analysis and processing of MWEs is their internal variability. Indeed, the feature that mostly characterises MWEs is their fixedness at some level of linguistic analysis, be it morphology, syntax, or semantics. The morphological aspect is not trivial in languages which exhibit a rich morphology, such as Romance languages. The issue is relevant in at least three ...

متن کامل

A Machine Learning Approach for the Identification of Bengali Noun-Noun Compound Multiword Expressions

This paper presents a machine learning approach for identification of Bengali multiword expressions (MWE) which are bigram nominal compounds. Our proposed approach has two steps: (1) candidate extraction using chunk information and various heuristic rules and (2) training the machine learning algorithm called Random Forest to classify the candidates into two groups: bigram nominal compound MWE ...

متن کامل

USzeged: Identifying Verbal Multiword Expressions with POS Tagging and Parsing Techniques

The paper describes our system submitted for the Workshop on PARSEME’s Shared Task on automatic identification of verbal multiword expressions . It uses POS tagging and dependency parsing to identify singleand multi-token verbal MWEs in text. Our system is language-independent and competed on nine of the eighteen languages. Our paper describes how our system works and gives its error analysis f...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009